©2018 The MITRE Corporation. ALL RIGHTS RESERVED
Leveraging Local Data for Public Health with SMART Scatter
Session 123, February 13, 2019
Kevin Gormley, Principal Data Scientist
The MITRE Corporation
Dawn Heisey-Grove, Principal Epidemiologist
The MITRE Corporation
Approved for Public Release; Distribution Unlimited. MITRE Public Release Case Number 18-4403.
Conflict of Interest
Kevin Gormley, PhD
Has no real or apparent conflicts of interest to report.
Conflict of Interest
Dawn Heisey-Grove, MPH
Has no real or apparent conflicts of interest to report.
Agenda
Learning Objectives
Background
SMART Scatter
Results
Further Research
Learning Objectives
Describe how local administrative data can inform public health policy, surveillance and intervention activities by revealing relevant community characteristics
Explain how research partnerships between local government, academia and industry can improve population health
List different geographic levels of data that may be available to understand a local population
Interpret what the areas of high risk on a "heat map" represent for a given health issue
Background: Health and Wellness Are Driven by Local Factors
Communities drive improvement
Interventions target local at-risk populations
Local factors influence policy success and impact
Health is local
Current Problems Using Local Data
Data silos exist at all levels of government and impede development of a complete model of a community’s well-being
Federal, state, and local government administrative data are not integrated with surveys and other healthcare data
Socio-Ecological Relationships Associated with Domestic Violence
Policy, Systems, & Society: National, state, local policy; education of women; public awareness; firearms policies; emergency systems
Neighborhood & Community: Neighborhood environment; culture of violence; access to services; quality of housing; drug use; social isolation
Interpersonal & Family: Family relationships; patriarchal culture; role of women; alcohol/drug use; poverty; employment
Individual: Individual attitudes, behaviors, health, social history
Domestic Violence is Significantly Under-Reported
Strong reliance on survey data and reports to law enforcement and Child Protective Services
There is significant opportunity to improve case finding methodologies
Type of Domestic Violence | Estimated Underreporting
Child abuse and neglect | 45-76% of cases are not reported [1-3]
Elder abuse | For every known case, 24 are unknown [4]
Intimate partner violence (against females) [5]: Physical assault | 3 of 4 cases not reported
Intimate partner violence (against females) [5]: Rape | 4 of 5 cases not reported
Intimate partner violence (against females) [5]: Stalking | Half not reported
1. The National Center for Fatality Review and Prevention. Child Abuse and Neglect. https://www.ncfrp.org/reporting/child-abuse-and-neglect/
2. U.S. Government Accountability Office. (2011). Child Maltreatment: Strengthening National Data on Child Fatalities Could Aid in Prevention. GAO-11-599. https://www.gao.gov/products/GAO-11-599
3. Commission to Eliminate Child Abuse and Neglect Fatalities. (2016). Within our reach: A national strategy to eliminate child abuse and neglect fatalities. Washington, DC: Government Printing Office.
4. National Center on Elder Abuse. Statistics and Data. https://ncea.acl.gov/whatwedo/research/statistics.html
5. National Institute of Justice and the Centers for Disease Control and Prevention. (2000). “Extent, Nature and Consequences of Intimate Partner Violence: Findings from the National Violence Against Women Survey.”
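The underreporting estimates in the table are stated in mixed forms (ratios, "N of M", and "half"), which makes them hard to compare. As a small worked example, the sketch below converts each one into the implied share of cases that are reported. The dictionary keys simply repeat the table labels; the conversion itself is plain arithmetic and is not taken from the cited sources. By the same arithmetic, the child abuse range of 45-76% not reported corresponds to 24-55% reported.

```python
from fractions import Fraction

# Share of cases that ARE reported, derived from the table's wording.
reported = {
    # "For every known case, 24 are unknown": 1 known out of 25 total.
    "Elder abuse": Fraction(1, 1 + 24),
    # "3 of 4 cases not reported"
    "Physical assault": 1 - Fraction(3, 4),
    # "4 of 5 cases not reported"
    "Rape": 1 - Fraction(4, 5),
    # "Half not reported"
    "Stalking": 1 - Fraction(1, 2),
}

for name, share in reported.items():
    print(f"{name}: {float(share):.0%} reported")
```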
Why Use Neighborhood/Community Factors?
Research demonstrates that community-level factors significantly contribute to domestic violence risk, after taking into account individual-level factors
Findings are mixed about the influence of community-level factors on risk
Only one study has explored the relationship using geographic areas smaller than census tract
Majority are conducted at census tract level
None at Census Block Group level
*Beyer, K., Wallis, A.B., Hamberger, L.K. (2015). Neighborhood environment and intimate partner violence: A systematic review. Trauma, Violence, & Abuse, 16(1): 16-47.
[Figure: geographic levels and approximate populations. Census Block Group: ~600-3,000 people; Census Tract: ~1,200-8,000 people; ZIP Code: ~3,000-30,000+ people; County: ~10,000-200,000+ people]
Simulated Multivariate Adaptive Regression Technique (SMART) Scatter Overview
Idea: Developed by Virginia Tech and MITRE, SMART Scatter uses multiple imputation methods to estimate household-level risk factors, which are refined by resampling to better match aggregate-level distributions from the relevant geographic regions.
Findings: We can develop a comprehensive perspective of the public health burden in the community.
Impact: Help public health staff discover new patterns of high-risk areas for policy interventions.
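The refinement step named in the Idea can be illustrated with a minimal sketch. This is not SMART Scatter's actual algorithm, which these slides do not spell out. It assumes a simple stratified resampling in which imputed household incomes are redrawn, with replacement, so that the number of households in each income bin equals a block group's published marginal count. The function names, bin edges, and all numbers are hypothetical.

```python
import random
from collections import Counter

def income_bin(value, bin_edges):
    """Return the index of the income bin that contains value."""
    for i, upper in enumerate(bin_edges):
        if value < upper:
            return i
    return len(bin_edges)

def resample_to_marginal(imputed, bin_edges, target_counts, seed=0):
    """Redraw imputed values so each bin holds its target count.

    Assumption: every bin with a nonzero target has at least one
    imputed value to draw from; empty bins are skipped here.
    """
    rng = random.Random(seed)
    pools = {}
    for v in imputed:
        pools.setdefault(income_bin(v, bin_edges), []).append(v)
    refined = []
    for b, n in enumerate(target_counts):
        pool = pools.get(b, [])
        if pool:
            refined.extend(rng.choices(pool, k=n))
    return refined

# Hypothetical imputed incomes ($1000) for ten households in one block group.
imputed_income = [20, 30, 35, 40, 60, 80, 90, 120, 150, 200]
bin_edges = [50, 100]      # bins: <50, 50-100, >=100
acs_counts = [2, 5, 3]     # hypothetical marginal table for the block group

refined_income = resample_to_marginal(imputed_income, bin_edges, acs_counts)
refined_counts = Counter(income_bin(v, bin_edges) for v in refined_income)
print([refined_counts[b] for b in range(len(acs_counts))])  # -> [2, 5, 3]
```

The initial imputation puts 4, 3, and 3 households in the three bins; after resampling the counts match the marginal table exactly, while the values inside each bin still come from the imputation.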
SMART Scatter Data Types
Data Type | Example
Point Estimates | Information about an individual household
Marginal Distributions | How income is distributed within a Census Block Group
Joint Distributions | US Census American Community Survey data showing how variables such as income and educational attainment co-vary
Events | Police reports of a specific point location, time and event type
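One way to picture the four data types is as simple record structures. The sketch below is an illustration only: the class and field names are hypothetical, the example values are made up, and nothing here reflects how SMART Scatter stores its inputs.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

@dataclass
class PointEstimate:
    """A known value for one household, e.g. from a tax record."""
    household_id: str
    variable: str
    value: float

@dataclass
class MarginalDistribution:
    """Counts of one variable within one Census Block Group."""
    block_group_id: str
    variable: str
    bin_counts: Dict[str, int]

@dataclass
class JointDistribution:
    """Counts over pairs of categories, showing how two variables co-vary."""
    variables: Tuple[str, str]
    cell_counts: Dict[Tuple[str, str], int]

@dataclass
class Event:
    """A report tied to a point location, time, and event type."""
    latitude: float
    longitude: float
    timestamp: str
    event_type: str

# Hypothetical examples of each type.
point = PointEstimate("HH-001", "property_value", 350000.0)
marginal = MarginalDistribution("BG-01", "income", {"<50k": 20, "50-100k": 50, ">=100k": 30})
joint = JointDistribution(("income", "education"), {("<50k", "high school"): 12, (">=100k", "college"): 25})
event = Event(38.9, -77.3, "2016-05-01T22:15:00", "domestic disturbance")

print(sum(marginal.bin_counts.values()))  # total households in the marginal table
```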
SMART Scatter Uses Multiple Data Sources to Fill in Demographic Detail at the Household Level
Uses information from:
Real estate tax records
Geography, county Geographic Information System (GIS) data
US Census: American Community Survey (ACS) Public Use Microdata Sample (PUMS)
US Census: ACS summaries by Block Group (~600-3,000 people) of household income, property value, # bedrooms, year residence built
County apartment surveys
[Figure: panels labeled "Local Tax Records, Surveys, Distributions by block group" and "Random Survey Sample"; a plot of Sqrt Income against House Value and a histogram of income ($1000), count per $25000]
Imputed Variables Have Conditional Distributions Consistent with Those Observed in the PUMS* Data
[Figure: Sqrt Income by Rooms Per House (1 to 14), comparing Imputed and PUMS sources]
*US Census: American Community Survey (ACS) Public Use Microdata Sample (PUMS)
With Existing Imputation Methods, Estimates Do Not Match Distributions by Block Group
[Figure: Household Income Distribution by Block Group, comparing MICE* imputation with the ACS table for each Block Group ID]
* Multivariate Imputation by Chained Equations (MICE), a statistical package for R software
Imputations that Incorporate ACS Marginal Table Information via SMART Scatter Fit Distributions Better
[Figure: Household Income Distribution by Block Group, comparing SMART Scatter imputation with the ACS table for each Block Group ID]
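The comparison in these two figures is visual. One hedged way to put a number on "fits better" is the total variation distance between the imputed and ACS income shares in a block group; the slides do not say which fit measure, if any, the authors used, so the metric choice, function names, and counts below are all assumptions for illustration.

```python
def shares(counts):
    """Convert bin counts to proportions."""
    total = sum(counts)
    return [c / total for c in counts]

def total_variation(imputed_counts, acs_counts):
    """Half the summed absolute difference in bin shares (0 = identical)."""
    p, q = shares(imputed_counts), shares(acs_counts)
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical income-bin counts for one block group.
acs_table = [20, 50, 30]
unconstrained_imputation = [40, 40, 20]   # ignores the marginal table
constrained_imputation = [22, 48, 30]     # uses the marginal table

tv_unconstrained = total_variation(unconstrained_imputation, acs_table)
tv_constrained = total_variation(constrained_imputation, acs_table)
print(round(tv_unconstrained, 3), round(tv_constrained, 3))
```

A smaller distance means the imputed distribution sits closer to the ACS table for that block group; repeating the calculation per Block Group ID would give one number per bar group in the figures.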
Model Features
Model Features (Independent variables) | Source | Description of Source Variable | SMART Scatter Application | Basic Model Application
Median income | PUMS/ACS | Median income by house and by census block group | Used for imputation | Modeling feature
Property tax | CORELogic/PUMS | Tax levied on the property | Used for imputation | Not used*
Number of rooms in household | CORELogic/PUMS | Household level continuous | Imputed modeling feature | Modeling feature
Multigenerational household | PUMS | Flag for the presence of more than two generations living in the house | Imputed modeling feature | Not used*
Single parent | PUMS | Flag for a single parent household | Imputed modeling feature | Not used*
Household size (people) | PUMS | Number of people living in the house | Imputed modeling feature | Not used*
Unmarried partner | PUMS | Flag for cohabitating adults who are unmarried | Imputed modeling feature | Not used*
Special needs child in household | PUMS | Flag for the presence of a special needs child | Imputed modeling feature | Not used*
Military service member in household | PUMS | Flag for the presence of a military service member | Imputed modeling feature | Not used*
Rate of drug-related calls to police by census block group | County police data | Extracted from police data based on DRUG flag | Modeling feature | Modeling feature
*Data element could not be applied at the level of census block group.
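The last feature in the table, the rate of drug-related calls by census block group, is an aggregate of event records. A minimal sketch of that aggregation follows. The record layout, the field names, and the choice of households as the denominator (rate per 1,000 households) are assumptions; the slides say only that the feature is extracted from county police data using the DRUG flag.

```python
from collections import Counter

# Hypothetical call records already assigned to a block group.
calls = [
    {"block_group": "BG1", "drug_flag": True},
    {"block_group": "BG1", "drug_flag": False},
    {"block_group": "BG1", "drug_flag": True},
    {"block_group": "BG2", "drug_flag": True},
    {"block_group": "BG2", "drug_flag": False},
]

# Hypothetical household counts per block group (assumed denominator).
households = {"BG1": 400, "BG2": 250}

# Count only the calls carrying the drug flag.
drug_calls = Counter(c["block_group"] for c in calls if c["drug_flag"])

# Drug-related calls per 1,000 households in each block group.
rate_per_1000 = {bg: 1000 * drug_calls[bg] / n for bg, n in households.items()}
print(rate_per_1000)
```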
Model Features Rationale
Model Features | Supporting Literature
Median income | Brown et al., “A Longitudinal Analysis of Risk Factors for Child Maltreatment.” (1998); Lachs and Pillemer, “Elder Abuse.” (2015); Stith et al., “Intimate Partner Physical Abuse Perpetration and Victimization Risk Factors.” (2004); Tjaden and Thoennes, “Extent, Nature, and Consequences of Intimate Partner Violence.” (2000)
Property tax | Shanahan et al., “The within Poverty Differences in the Occurrence of Physical Neglect.” (2017)
Number of rooms in household | Johannesen and LoGiudice, “Elder Abuse.” (2013); National Research Council (US) Panel to Review Risk and Prevalence of Elder Abuse and Neglect, Elder Mistreatment. (2003)
Multigenerational household | Brown et al., “A Longitudinal Analysis of Risk Factors for Child Maltreatment.” (1998); Cancian, Slack, and Yang, “The Effect of Family Income on Risk of Child Maltreatment.” (2010)
Single parent | Brown et al., “A Longitudinal Analysis of Risk Factors for Child Maltreatment.” (1998)
Household size (people) | Brown et al., “A Longitudinal Analysis of Risk Factors for Child Maltreatment.” (1998); Johannesen and LoGiudice, “Elder Abuse.” (2013); Lachs and Pillemer, “Elder Abuse.” (2015)
Unmarried partner | Moffitt and Caspi, “Findings About Partner Violence From the Dunedin Multidisciplinary Health and Development Study.” (1999); Tjaden and Thoennes, “Extent, Nature, and Consequences of Intimate Partner Violence.” (2000)
Special needs child in household | Children’s Bureau, “Child Maltreatment 2012.” (2012)
Military service member in household | National Intimate Partner and Sexual Violence Survey Technical Report, 2010
Rate of drug-related calls to police by census block | Capaldi et al., “A Systematic Review of Risk Factors for Intimate Partner Violence.” (2012); Kelleher et al., “Alcohol and Drug Disorders among Physically Abusive and Neglectful Parents in a Community-Based Sample.” (1994)
Posterior Realizations of Household Variables
Data from:
Real estate tax records
Geography, county GIS data
US Census: American Community Survey (ACS) Public Use Microdata Sample (PUMS)
US Census: ACS summaries by Block Group (~600-3,000 people) of household income, property value, # bedrooms, year residence built
County apartment surveys
SMART Scatter
Realizations of:
Household income
Number of rooms
Household size
Multigenerational household
Single parent
Unmarried partner
Special needs child
Military service
Outcome Variable: Domestic Disturbance Calls to County Police
Police data for 2015-2017 included calls categorized by:
Domestic/family disturbance (the largest relevant category, and the one used for modeling purposes)
Homicide/Aggravated Assault/Assault Family
Cruelty toward child
Other family offense
Limitations:
Calls without an address are recorded at the local courthouse address
Correctly identifying which calls to include in the model is a challenge (both specificity and sensitivity)
Domestic violence prevalence in the county is still underestimated because only one data source for domestic violence is used
Comparing Logistic Regression Coefficients from SMART Scatter and Baseline Model
The 3 most significant variables were substance abuse rate in block group (p < 0.01), house size (p < 0.01), and unmarried partner (p < 0.05).
[Figure: coefficient estimates for (Intercept), Substance abuse rate, Household size, Household income, Military service, Multigenerational household, Number of rooms, Single parent, Special needs child, and Unmarried partner; markers distinguish SMART Scatter estimates (o) from the baseline estimate]
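Results: Comparison of Simple Model with SMART Scatter Model
$\mathrm{Deviance} = -2\,\ell(\beta; y)$
$\mathrm{AIC} = 2k - 2\,\ell(\beta; y)$, where $k$ = number of parameters
Simple Model: $\mathrm{Dev}_0 = 1022.74$, $\mathrm{AIC}_0 = 1623.83$
SMART Scatter Model: $\mathrm{Dev}_1 = 846.88$, $\mathrm{AIC}_1 = 1459.96$
$\Delta\mathrm{Dev} = 175.87$, $\Delta\mathrm{AIC} = 163.87$: favors SMART Scatter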
Random “noise” added to the predicted values.
County staff received heat maps with full accuracy.
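As a worked check of the comparison on this slide, the sketch below recomputes the two differences from the reported values. Under the stated definitions, AIC equals 2k plus the deviance, so the gap between the deviance difference and the AIC difference equals twice the number of extra parameters in the larger model. The implied figure of about six extra parameters is an inference from the arithmetic, not a number stated in the slides. The rounded inputs give a deviance difference of 175.86 rather than the 175.87 shown, presumably because the slide's difference was computed before rounding.

```python
# Values as reported on the slide.
dev_simple, aic_simple = 1022.74, 1623.83
dev_smart, aic_smart = 846.88, 1459.96

# Positive differences mean the SMART Scatter model scores lower (better).
delta_dev = dev_simple - dev_smart
delta_aic = aic_simple - aic_smart

# AIC = 2k + Deviance, so delta_aic = delta_dev - 2 * (k_smart - k_simple).
implied_extra_params = (delta_dev - delta_aic) / 2

print(round(delta_dev, 2), round(delta_aic, 2), round(implied_extra_params))
```

Because the AIC difference stays large after the penalty for the additional parameters, the comparison favors the SMART Scatter model on both criteria.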
Interactive Dashboard: Can Explore Risk Factor Estimates and Predictions by Census Block Group (or other geography)
Initial results show predictions and risk factors’ median values and z-scores (standard deviations from the mean) for a given Census Block Group.
The z-score indicates how many standard deviations above or below the mean of all county block groups this block group falls. [“Household Income of -1.00” indicates 1 standard deviation below the mean.]
Hovering the mouse over a block group pops up a table.
Block Groups with < 50 households are not drawn.
Random “noise” added to the predicted values.
County staff received heat maps with full accuracy.
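The z-scores shown in the dashboard table can be sketched as follows. The block group names and values are hypothetical, and the use of the population standard deviation across all county block groups is an assumption, since the slides do not say which standard deviation was used. The sketch also applies the display rule that block groups with fewer than 50 households are not drawn, while still including them when computing the county mean.

```python
from statistics import mean, pstdev

# Hypothetical median household income ($1000) per block group.
median_income = {"BG1": 40.0, "BG2": 60.0, "BG3": 40.0, "BG4": 60.0}
# Hypothetical household counts per block group.
households = {"BG1": 120, "BG2": 300, "BG3": 45, "BG4": 210}

values = list(median_income.values())
mu = mean(values)          # county-wide mean across block groups
sigma = pstdev(values)     # population standard deviation (assumed)

# z-score: distance from the county mean in standard deviations.
z = {bg: (v - mu) / sigma for bg, v in median_income.items()}

# Block groups with fewer than 50 households are not drawn.
drawn = {bg: score for bg, score in z.items() if households[bg] >= 50}
print(drawn)
```

Here BG1 has a household income z-score of -1.00, meaning one standard deviation below the county mean, and BG3 is left off the map because it has only 45 households.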
Current Research
Health policy simulation research using agent-based modeling to explore how policy interventions influence the long-term impacts of domestic violence (child maltreatment, intimate partner abuse, and elder abuse)
Future Research
Apply SMART Scatter to more communities, especially rural counties, which have fewer census block groups spread over a large area
Apply SMART Scatter to other public health issues, such as asthma and diabetes
[Figure: Cycle of Violence]
Questions
Dawn Heisey-Grove, heiseygroved@mitre.org
Kevin Gormley, kgormley@mitre.org
Please complete the online session evaluation!
We thank our partners at the Virginia Tech Biocomplexity Institute.
MITRE’s mission-driven teams are dedicated to solving problems for a safer world. Through our federally funded R&D centers and public-private partnerships, we work across government to tackle challenges to the safety, stability, and well-being of our nation. Learn more at www.mitre.org.